Abstract. The El Farol bar model, proposed to study the dynamics of competition of agents in a variety of 
contexts (W. B. Arthur, Amer. Econ. Assoc. Pap. and Proc. 84, 406 (1994)), is studied. We characterize in 
detail the three regions of the phase diagram (efficient, better than random and inefficient) of the simplest 
version of the model (D. Challet and Y.-C. Zhang, Physica A, 246, 407 (1997)). The efficient region is 
shown to have a rich structure, which is investigated in some detail. Changes in the payoff function enhance 
further the tendency of the model towards a wasteful distribution of resources. 

PACS. 02.50.-r Probability theory, stochastic processes, and statistics - 02.50.Ga Markov processes - 
05.40.+j Fluctuation phenomena, random processes, and Brownian motion 



1 Introduction 

In recent years there has been a growing interest in
understanding the dynamics of systems of interacting
individuals with competing goals (frustration). Simple rules for
the behavior of the individuals may lead to unexpected
properties in the behavior of the collectivity. These rather
general premises can apply to problems in different fields,
such as economics [?], ecology or physics [?].

To illustrate these facts, Brian Arthur introduced what
he called the "El Farol" bar problem (EFBP) [?]. $N$
individuals decide, at each time step, to go to a bar or to stay at
home. The bar is enjoyable only if the attendance does not
surpass some critical number, which can be thought of as
a comfort capacity. But each individual does
not know beforehand what is going to happen. To be able
to make the decision for the next time step, the individuals
(which we will call agents in the following, as in previous
literature on this model) are each provided with a set
of strategies. Using these strategies, and the knowledge of
what has happened in the portion of the history that they
can recall, the agents make their decisions.

D. Challet and Y.-C. Zhang [?] have given a precise
set of rules which determine the model. The two possible
choices, going to the bar or staying at home, are
represented by 0 and 1. A choice is successful if the agents
which make it are in the minority (comfort capacity =
50%). The outcome of a given simulation is represented
by a series of 0's and 1's which characterize the successful
choices at each time step. Each agent uses a fixed set of
$s$ strategies, taken at random from the pool of all possible
strategies. Strategies use the full information of the
$m$ previous outcomes to decide the next move. As there
are $2^m$ possible combinations of past events, the number
of strategies is $2^{2^m}$. After each event, the agents update
the scores of their set of strategies. The gain made by the
successful strategies can either be a fixed constant, or
depend on the size of the group formed at that time step. In
the simplest version of the model, one point is assigned to
each successful strategy. When an agent has two or more
strategies with the same score, one of them is picked at
random. This choice of payoff is the one discussed in detail
below. The model is defined by three parameters: $N$,
the number of agents; $m$, the number of time steps used
by each strategy in determining the next best move; and
$s$, the number of strategies available to each agent.
Extensions to other payoff schemes, similar to those used in [?],
are also mentioned. Note that the original work [?] used
a much less constrained set of strategies and a different
comfort capacity (60%).
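
The rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the results reported here: the function name `minority_game`, the random initial history, the drawing of strategies with replacement, and the small random jitter used to break ties between equally scored strategies are all choices made for brevity.

```python
import numpy as np

def minority_game(n_agents=101, m=2, s=2, n_steps=2000, seed=0):
    """Return the number of agents choosing 1 at each time step.

    Assumes n_agents is odd, so that a minority always exists.
    """
    rng = np.random.default_rng(seed)
    n_hist = 2 ** m  # number of distinct histories of length m
    # strategies[i, j, h]: choice (0 or 1) of strategy j of agent i after history h
    strategies = rng.integers(0, 2, size=(n_agents, s, n_hist))
    scores = np.zeros((n_agents, s))
    history = int(rng.integers(0, n_hist))  # last m outcomes packed into an integer
    agents = np.arange(n_agents)
    attendance = np.empty(n_steps, dtype=int)
    for t in range(n_steps):
        # Scores are integers, so a jitter below 1 only breaks ties, at random.
        best = np.argmax(scores + 0.5 * rng.random(scores.shape), axis=1)
        choices = strategies[agents, best, history]
        n_ones = int(choices.sum())
        attendance[t] = n_ones
        winner = 1 if n_ones < n_agents / 2 else 0  # the minority side wins
        # Step payoff: one point to every strategy that prescribed the winning side.
        scores += strategies[:, :, history] == winner
        history = ((history << 1) | winner) % n_hist
    return attendance
```

Calling `minority_game(n_agents=101, m=2, s=2)` returns a time series of attendances whose histogram can be compared qualitatively with the distributions discussed in the text.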

The model, with the set of rules described above, was
investigated in [?]. The authors analyze the mean size
and the fluctuations of the groups taking each of the two
choices available. It is argued that the model can be
characterized in terms of the combination $\rho = 2^m/N$ alone.
The average group size is $N/2$. The distribution of sizes is
symmetrical around this value. The mean quadratic
deviation from the average, $\sigma$, is a measure of the number of
points accumulated by all the agents. This number is
maximum when the two groups are almost equal, in which case
$\sigma \sim O(1)$. In terms of $\sigma^2/N$ and $\rho$, three regimes can
be distinguished, depending on the total number of
strategies at play [?]: i) When $\rho > 1$, the number of strategies
available to the agents is small, and the value of $\sigma$
approaches the limit expected when the agents take random
decisions, $\sigma^2/N = 1/4$. ii) If $\rho \ll 1$, almost all possible
strategies are in possession of the agents, and their
performance is worse than random, as $\sigma^2/N > 1/4$. iii) Finally, for
$\rho \sim 1$ the agents perform statistically better than random.

The curve of $\sigma^2/N$ versus $\rho$ shows a minimum. The authors
define regime i) as inefficient, as the agents have little
information, and regime ii) as efficient, as agents have all
available information at their disposal.
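
As a hedged numerical check of the random limit quoted above, the following sketch computes the control parameter and the normalized fluctuation for agents that choose by coin toss. The helper names `rho` and `sigma2_over_n` are hypothetical, and the deviation is measured from $N/2$, the average group size.

```python
import numpy as np

def rho(m, n_agents):
    """Control parameter 2**m / N."""
    return 2 ** m / n_agents

def sigma2_over_n(attendance, n_agents):
    """Mean quadratic deviation of the attendance from N/2, divided by N."""
    a = np.asarray(attendance, dtype=float)
    return np.mean((a - n_agents / 2) ** 2) / n_agents

# Agents deciding at random: the attendance is binomial with variance N/4,
# so sigma^2/N should come out close to 1/4.
rng = np.random.default_rng(0)
N = 101
random_attendance = rng.binomial(N, 0.5, size=200_000)
baseline = sigma2_over_n(random_attendance, N)
```

Values of `sigma2_over_n` above or below this baseline then indicate performance worse or better than random, respectively.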

In section 2, we analyze the model defined above, with
emphasis on the structure shown in the efficient region.
Section 3 presents an interpretation of the results. Then,
in section 4, we discuss results obtained by varying the
payoff function which determines the choice of strategies.
Section 5 analyzes a seemingly trivial variation of the model:
the majority game, in which it becomes preferable to be in
the majority. The final section presents the conclusions.



2 Minority game. 

The transition discussed in [?] is displayed in fig. 1 for
$s = 2$ and $s = 6$. The difference between the efficient and
inefficient regimes is sharper for small values of $s$. Each
simulation of the model starts from a history of length
$m + 3$ to initialize the scores of the strategies. The results
shown in the paper are averages over the $2^{m+3}$ possible
initial conditions defined in this way. In almost all cases,
the system evolves towards a steady state which is
independent of the initial conditions.
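
The averaging over initial conditions can be sketched as follows. This is only an illustration: `initial_histories` and `average_over_histories` are hypothetical helpers, and `observable` stands for any user-supplied function that runs one simulation from a given initial history and returns a number.

```python
from itertools import product

def initial_histories(m):
    """All binary histories of length m + 3, i.e. 2**(m + 3) tuples of 0's and 1's."""
    return list(product((0, 1), repeat=m + 3))

def average_over_histories(observable, m):
    """Average of observable(history) over every initial history."""
    histories = initial_histories(m)
    return sum(observable(h) for h in histories) / len(histories)
```

For $m = 2$ this enumerates 32 initial histories.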

The peaks in the size distributions are always well
approximated by Gaussian functions. The large value of $\sigma$ in
the efficient region is due to the formation of new peaks
away from $N/2$. A pictorial view of this effect is shown in
fig. 2, where the different regimes are studied by varying
$m$ and $s$. The attendances have been normalized to
the interval $[-1, 1]$. In the range of values of $\rho$ where
three peaks can be clearly resolved, the weight of the
central peak is one half of the total, and the other two peaks
each include one fourth of the recorded attendances. The
central peak is always well approximated by a Gaussian of
width $\sqrt{N}/2$ (see also fig. 3), which corresponds to
random choices by the agents.

As one leaves the efficient region, the peaks merge with
the central one, whose width first decreases and then
increases, to reach the random value for large values of $\rho$.
For small values of $\rho$, the peak structure is very rich, and
seems self-similar, as shown in fig. 3.

As pointed out in [?], the poor performance of the
agents when a large amount of information is available is
somewhat unexpected. Even more remarkable is the rich
structure shown in fig. 3, which shows that the evolution is
far from random. This behavior is also consistent with the
existence of non-trivial patterns in the time series, beyond
the reach of the agents [?].

A plot of the attendances at successive times is shown
in fig. 4. We have chosen the parameters in such a way
that the distribution of attendances shows three separated
peaks.

Fig. 1. Different phases found in the EFBP. The lower part
shows the evolution of $\sigma^2/N$ as a function of $\rho$; circles are for
$s = 2$ and stars are for $s = 6$. The insets show histograms of the
attendance number in the different phases, with $N = 101$: a)
efficient, $m = 2$, $s = 2$; b) better than random, $m = 6$, $s = 2$; and c)
inefficient, $m = 10$, $s = 2$. The top figure shows the difference in
score between the highest-scoring and the lowest-scoring
strategies in these three cases: dotted line for (a), dashed
line for (b), and continuous line for (c).

We have completed the study of the different peaks
by analyzing their evolution after an initial
series of random choices. In the time series shown in fig. 5,
the agents initially make choices at random, although the
scores of their strategies keep being updated. At a given time step (2048),
the agents start to use the strategies at their disposal.

The peak structure is robust, and develops immediately.
As shown in fig. 5, the peaks split from the central
peak and move to their positions in the steady state
discussed earlier.



3 Interpretation. 

The results presented in the previous section allow us to
gain some understanding of the complex dynamics of the
efficient regime. In this region, no strategy can stay with
the highest score for long. The repeated use of a given
strategy by a significant number of agents leads to the
rise of other strategies, preferably those more
anticorrelated with the one at play. As a result, the
highest-scoring strategy (the one considered best by the agents)
is very likely to make its users lose. Eventually,
the agents segregate into anticorrelated groups when
some degree of evolution is incorporated [?].

Fig. 2. Attendance number distributions for $N = 1001$

Fig. 3. Attendance number distribution for $N = 100001$, $s = 4$,
and $m = 4$, normalized in the interval [0, 100001]. (a) Full
distribution, where the y-axis has been truncated in order to
appreciate the spreading of the lateral peaks. (b) Magnification
of the region marked in (a) with dashed lines. (c) Points in the
central peak. The continuous line is a Gaussian, centered at
$N/2$, with weight half of the total distribution and deviation
$\sqrt{N}/2$.



For simplicity, we now assume that there are two
anticorrelated strategies, $x$ and $\bar{x}$, which have the highest
scores most of the time. Let us denote by $n_x$ and $n_{\bar{x}}$ the
number of agents which have strategy $x$ and $\bar{x}$, respectively. We can
take $n_x \approx n_{\bar{x}} = n_{\rm correl}$. We now denote by $n_{\rm random}$ the
number of agents which have neither $x$ nor $\bar{x}$. The choices
of these $n_{\rm random}$ agents can be taken to be at random, as
they are unable to recognize the series which give rise to
the high scores of $x$ and $\bar{x}$.

Fig. 4. Attendance in a given group at two successive intervals.
The parameters used are $s = 2$, $m = 2$ and $N = 1001$.

Fig. 5. Attendance number versus time for the game in which
a transition is forced from a random game to a minority game
(see text). The parameters of the minority game are: $N = 1001$,
$s = 4$, and $m = 4$ (6) for the bottom (top) graphs.

When strategy $x$ has the highest score, the two groups
will have sizes close to $n_{\rm random}/2 + n_{\rm correl}$ and
$n_{\rm random}/2 - n_{\rm correl}$, respectively. This outcome will give no points to
$x$, while strategy $\bar{x}$, which would have led to the most
favorable choice, gains one point. If the score of $\bar{x}$ remains
below that of $x$, the process repeats itself. A steady state is
reached when the scores of $x$ and $\bar{x}$ differ by, at most, one
point. Then, an outcome with two unequal groups of sizes
$n_{\rm random}/2 + n_{\rm correl}$ and $n_{\rm random}/2 - n_{\rm correl}$ is followed
by the formation of two groups of similar size, $\sim N/2$.
The fact that there are $n_{\rm random}$ agents acting at random
implies that these values are the averages of Gaussian peaks
of similar width.
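
The cycle described above can be illustrated with a toy model that tracks only the score difference between the two dominant strategies. It is a sketch under strong assumptions, not a simulation of the full game: the variable `d`, the labels of the three outcomes, and the rule that a tie is resolved by a fair coin are all introduced here for illustration.

```python
import random

def toy_cycle(n_steps=100_000, seed=0):
    """Fraction of time steps spent in each of the three kinds of outcome."""
    rng = random.Random(seed)
    d = 0  # score of x minus score of its anticorrelated partner
    counts = {"center": 0, "x_majority": 0, "xbar_majority": 0}
    for _ in range(n_steps):
        if d > 0:
            # x leads: its users crowd one side and lose, the partner gains a point.
            counts["x_majority"] += 1
            d -= 1
        elif d < 0:
            # The partner leads: the symmetric situation.
            counts["xbar_majority"] += 1
            d += 1
        else:
            # Tie: two groups of similar size; either strategy may gain the point.
            counts["center"] += 1
            d += rng.choice((-1, 1))
    return {key: value / n_steps for key, value in counts.items()}
```

In this toy model the tied state is visited every other step, so the central outcome carries half of the weight and each unequal outcome about one fourth, in line with the weights of the three peaks reported in the previous section.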

We can estimate the value of $n_{\rm correl}$ from the analysis
in [?]. We classify the $2^{2^m}$ strategies into $2^m$ mutually
uncorrelated, maximally correlated or anticorrelated classes.
Then, $n_x \approx N/2^m = 1/\rho$.
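
The counting used in this estimate can be made concrete with a short sketch. The encoding of a strategy as the bits of an integer, and the helper names below, are assumptions made for illustration; in this encoding the anticorrelated partner of a strategy is its bitwise complement, which prescribes the opposite choice for every history.

```python
def n_strategies(m):
    """Number of strategies for memory m: one binary choice per history."""
    return 2 ** (2 ** m)

def strategy_table(index, m):
    """Choices (0 or 1) prescribed by strategy `index` for each of the 2**m histories."""
    return [(index >> h) & 1 for h in range(2 ** m)]

def hamming(a, b):
    """Number of histories for which two strategies prescribe different choices."""
    return sum(x != y for x, y in zip(a, b))

def anticorrelated(index, m):
    """Index of the strategy that always prescribes the opposite choice."""
    return index ^ (n_strategies(m) - 1)

def n_correl_estimate(n_agents, m):
    """Estimate N / 2**m = 1 / rho used in the text."""
    return n_agents / 2 ** m
```

For example, with $N = 1001$ and $m = 2$ the estimate gives about 250 agents per dominant strategy.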

The previous analysis gives a plausible explanation
of the three peaks observed throughout most of the
efficient region of parameter space. It can be extended, in
a straightforward way, to the case when there are more
than two dominant strategies. The main new ingredient
is that there are situations in which two, or more,
dominant strategies can have the same score. Let us imagine
that the strategies with the highest scores are $x_1$, $x_2$, $\bar{x}_1$
and $\bar{x}_2$. Then, at a given instant, the strategy with the
highest score can be $x_1$ or $x_2$, but also $x_1$ and $x_2$ (or
similar combinations) simultaneously. If, in addition, $x_1$ and
$x_2$ lead to the same outcome, the majority group will be
of size $n_{\rm random}/2 + n_{x_1} + n_{x_2}$. This combination will
probably be less likely, leading to lower peaks further away
from the average, in agreement with the findings reported
here.

We have checked that there is a trivial case where this
analysis reproduces the observed evolution: $m = 2$ and
$s = 16$, where all agents have all strategies. The
attendance histograms show two sharp peaks at 1 and $N$, and
a Gaussian peak with half the weight of the total
distribution at $N/2$, and deviation $\sqrt{N}/2$.



4 Varying the rewards for the winners. 

We now look at the effect of changing the way in which the
scores of the different strategies are updated after each outcome. The
simplest modification is to relate the change in the score
to the size of the minority group [?]. In the following,
we assume that the payoff, $\Delta p$, depends linearly on the
size, $a$. If the score is incremented by $a$, strategies which
lead to groups with attendances close to $N/2$ are favored.
If, on the other hand, the score is incremented by $N/2 - a$,
the tendency is the opposite, and strategies which lead to
very small groups are favored.

The distributions generated by these two payoff choices
are plotted in figure 6. The distribution obtained with the
step payoff discussed in the previous section is also plotted,
for comparison.

Contrary to intuition, the two distributions seem to go
in the direction opposite to what the choice of payoff would
lead one to expect. It must be noted that, when the second choice of
payoff function is shifted by a constant, $\Delta p = N/2 - a + k$,
the central peak tends to disappear, and it is replaced
by two peaks at the sides. This result is similar to other
findings with a payoff which also favors small groups [?].
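
The payoff choices compared in this section can be collected in a single function. This is a minimal sketch: the function name `payoff` and the labels `"step"`, `"linear"` and `"inverse"` are hypothetical, `a` denotes the size of the minority group, and `k` is the constant shift mentioned above.

```python
def payoff(a, n_agents, kind="step", k=0.0):
    """Score increment for a successful strategy, given the minority size a."""
    if kind == "step":
        # One point per success, independent of the group size.
        return 1.0
    if kind == "linear":
        # Increment a: rewards minority groups with sizes close to N/2.
        return float(a)
    if kind == "inverse":
        # Increment N/2 - a + k: rewards very small minority groups.
        return n_agents / 2 - a + k
    raise ValueError(f"unknown payoff kind: {kind!r}")
```

For $N = 1001$ and a minority of 400 agents, the three choices give increments of 1, 400 and 100.5, respectively (with $k = 0$).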

We interpret the broad structure for the payoff
function $a$ as due to the swift shuffle of the highest-ranking
strategies. Outcomes with nearly equal groups give rise to
large changes in the scores of the strategies. Thus,
long-lived cycles, of the type described in the previous
section, cannot form. The highest-ranking strategy changes
rapidly. As all strategies are in play, groups of many sizes
are generated, despite the fact that the payoff favors sizes
close to $N/2$.

Fig. 6. Attendance distributions for $N = 1001$, $m = 4$, and $s = 4$.
Dashed line is for the step payoff, continuous line for $\Delta p = a$,
and dotted line for $\Delta p = N/2 - a$.

In the opposite case, with payoff function equal to
$N/2 - a$, we ascribe the large peak at $N/2$ to frequent
situations in which many strategies have the same score. This
situation is self-sustaining since, when the two groups are of
sizes $N/2$ and $N/2 + 1$, there is no change in the scores of
the strategies. This is what happens in half of the possible
$2^{m+3}$ initial conditions, and corresponds to the delta peak
in fig. 6. The rest of the distribution is a good average of
what happens in the other half of the initial conditions.
The shift of the payoff by a constant described earlier
reduces the probability of ties, and leads to a
double-peaked distribution. These peaks displaced from the
center seem, in this case, related to the two peaks in the step
payoff case. It is likely that the evolution of the model is
governed by cycles with a few dominating strategies.



5 Majority game. 

We have also studied the majority game, in which the
agents prefer to be in an overcrowded bar or to leave the
bar empty. The methodology is the same as in the
minority game. There, the different initial conditions tend
to give similar results. Here, the initial conditions may lead to
big changes in the attendance distributions.

Results are trivial (the full majority is attained at
all time steps) only when all agents have all strategies
($s = 2^{2^m}$). Even in this case, and depending on the initial
conditions, the group (0 or 1) which obtains the majority
may oscillate in time.

The distributions obtained for different values of $m$
and, consequently, $\rho$, are plotted in fig. 7.




Fig. 7. The analogous diagram of figure 1 for the majority
rule. Here $N = 101$ and $s = 4$. The dashed line is for $\sigma_m^2/N = N/2^{2s}$.



6 Conclusions. 

As we have seen, the El Farol bar problem has a rich 
structure. We have focused mostly on the behavior in the 
efficient regime, where most of the strategies are at the 
disposal of the agents. As already remarked in [?], the
model has many features in common with frustrated sys- 
tems in statistical mechanics. In particular, most initial 
conditions lead to a poor performance of the system as a 
whole. The model seems unable to select a pool of strate- 
gies such that the global gain by the agents is maximized. 
In particular, those agents which have access to the strate- 
gies with the highest scores at a given moment perform 
worse than those which do not. The latter play basically 
at random, and profit from the unproductive coordination 
of the players using the nominally best strategies. 

This effect seems to remain when the payoff to the dif- 
ferent strategies is varied. It is also remarkable that the in- 
trinsic frustration of the model shows up when the agents 
try to be in the majority. Most initial conditions lead 
to evolutions where the agents fail to coordinate among 
themselves.

The particular placement of the fixed points calls
for a more convenient measure of the efficiency.
We will use the mean deviation, $\sigma_m$, calculated
around the value $N$ for the attendance, after shifting the
attendances $a$ below $N/2$ to $a + N$. Thus, $\sigma_m$ also gives
a measure of the overall gain made by the agents. In the
three plots of attendances, where the attendance axis is
not folded, the two large peaks near the limits are not
shown. These peaks correspond to limit cycles where the
attendances do not fluctuate.
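
The folding used to define this measure can be sketched as follows. The function name `folded_sigma2_over_n` is hypothetical, and it is assumed here that the deviation is a mean quadratic deviation, as for $\sigma$ in the minority game.

```python
import numpy as np

def folded_sigma2_over_n(attendance, n_agents):
    """Mean quadratic deviation around N of the folded attendance, divided by N."""
    a = np.asarray(attendance, dtype=float)
    # Attendances below N/2 are shifted to a + N, so that a full majority
    # on either side (a = 0 or a = N) is mapped onto the same value, N.
    folded = np.where(a < n_agents / 2, a + n_agents, a)
    return np.mean((folded - n_agents) ** 2) / n_agents
```

With this definition, a series that alternates between an empty and a completely full bar has zero deviation, while evenly split groups give the largest values.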

The relative weight of this peak, for $s = 4$, at
sufficiently large times, is 0.56 for $m = 2$, 0.078 for $m = 6$
and 0.031 for $m = 10$. The number of agents which are
able to coordinate among themselves and take part in this
cycle is, on the average, $N - N/2^s$, if $s < 2^{2^m}$. Then, the
lower limit for $\sigma_m^2/N$ is $N/2^{2s}$. This value is also plotted